# Multimodal Task Processing

Openvla 7b Oft Finetuned Libero Spatial
MIT
OpenVLA-OFT is an optimized vision-language-action model that uses fine-tuning to significantly improve inference speed and task success rate over the base OpenVLA model.
Multimodal Fusion Transformers
moojink
2,513
3
Vitucano 2b8 V1
Apache-2.0
ViTucano is the first visual assistant natively pre-trained in Portuguese. It combines visual understanding with language capabilities and is suited to multimodal tasks such as image captioning and visual question answering.
Image-to-Text Transformers Other
TucanoBR
86
5